human and machine
What It's Like to Brainstorm with a Bot
Contrary to what many of my friends believe, good academics are always working--at least in the sense that when we're stuck on a problem, which is most of the time, it's impossible to leave it behind. A worthwhile problem is a brainworm: it stays with you until it's resolved or replaced by another one. My Dartmouth colleague Luke Chang, a neuroscientist who studies what happens in people's heads when we communicate, is no stranger to this affliction. One day, on a long drive back to Hanover, he found himself preoccupied with such a worm. The drive up I-89 is usually uneventful--a straight shot north, ideal for letting your mind off the leash. But Luke's mind snagged on a technical challenge: how to turn a decent model of facial expression into something truly convincing. The aim was to encode the various nuanced ways human faces transmit states of mind, and then to visualize them; smiles and frowns are the barest beginning. The spectrum of human emotions and intentions is embodied in a range of expressions which serve as a basic alphabet for communication.
- Information Technology > Artificial Intelligence > Natural Language (0.72)
- Information Technology > Artificial Intelligence > Cognitive Science (0.67)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.46)
- Information Technology > Artificial Intelligence > Vision > Face Recognition (0.34)
Conceptualization, Operationalization, and Measurement of Machine Companionship: A Scoping Review
The notion of machine companions has long been embedded in social-technological imaginaries. Recent advances in AI have moved those media musings into believable sociality manifested in interfaces, robotic bodies, and devices. These machines are often referred to colloquially as "companions," yet there has been little careful engagement with machine companionship (MC) as a formal concept or measured variable. To that end, this PRISMA-guided scoping review systematically samples, surveys, and synthesizes current scholarly work on MC (N = 71; 2017-2025). Works varied widely in how they considered MC: in their guiding theories, in the a-priori specified properties they emphasized (subjectively positive, sustained over time, co-active, autotelic), and in their measured concepts (more than 50 distinct measured variables). We ultimately offer a literature-guided definition of MC as an autotelic, coordinated connection between human and machine that unfolds over time and is subjectively positive.
- Europe > United Kingdom (0.04)
- Europe > Norway (0.04)
- Europe > Netherlands (0.04)
- (23 more...)
- Health & Medicine > Therapeutic Area (0.67)
- Information Technology > Security & Privacy (0.46)
Designing Algorithmic Delegates: The Role of Indistinguishability in Human-AI Handoff
Greenwood, Sophie, Levy, Karen, Barocas, Solon, Heidari, Hoda, Kleinberg, Jon
As AI technologies improve, people are increasingly willing to delegate tasks to AI agents. In many cases, the human decision-maker chooses whether to delegate to an AI agent based on properties of the specific instance of the decision-making problem they are facing. Since humans typically lack full awareness of all the factors relevant to this choice for a given decision-making instance, they perform a kind of categorization by treating indistinguishable instances -- those that have the same observable features -- as the same. In this paper, we define the problem of designing the optimal algorithmic delegate in the presence of categories. This is an important dimension in the design of algorithms to work with humans, since we show that the optimal delegate can be an arbitrarily better teammate than the optimal standalone algorithmic agent. The solution to this optimal delegation problem is not obvious: we discover that this problem is fundamentally combinatorial, and illustrate the complex relationship between the optimal design and the properties of the decision-making task even in simple settings. Indeed, we show that finding the optimal delegate is computationally hard in general. However, we are able to find efficient algorithms for producing the optimal delegate in several broad cases of the problem, including when the optimal action may be decomposed into functions of features observed by the human and the algorithm. Finally, we run computational experiments to simulate a designer updating an algorithmic delegate over time to be optimized for when it is actually adopted by users, and show that while this process does not recover the optimal delegate in general, the resulting delegate often performs quite well.
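The combinatorial flavor of this delegation problem can be illustrated with a toy brute-force sketch. The setup below is my own simplification, not the paper's formalism: each instance is a pair (h, m), the human observes only h (the "category" of indistinguishable instances), the machine observes only m, and the correct action depends on both. The human delegates a category exactly when the designed delegate has lower expected loss there, and the designer searches over delegate policies for the best team outcome.

```python
import itertools

# Toy world (hypothetical, XOR-like): the best action depends on both the
# human-visible feature h and the machine-visible feature m, so neither
# party alone can act perfectly. Features and actions are binary.
best_action = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
actions = [0, 1]

def human_policy(h):
    # The human picks one action per category h, minimizing expected 0/1
    # loss over the (uniform) instances that share that category.
    return min(actions, key=lambda a: sum(
        best_action[(h, m)] != a for m in range(2)))

def team_loss(machine_policy):
    # machine_policy maps the machine-visible feature m to an action.
    loss = 0
    for h in range(2):
        own = sum(best_action[(h, m)] != human_policy(h) for m in range(2))
        dele = sum(best_action[(h, m)] != machine_policy[m] for m in range(2))
        loss += min(own, dele)  # human delegates category h iff it helps
    return loss

# The designer brute-forces all delegate policies and keeps the best teammate.
best = min((dict(zip(range(2), p))
            for p in itertools.product(actions, repeat=2)),
           key=team_loss)
```

In this tiny world the best standalone machine still misclassifies half the instances, while the best delegate, combined with the human's category-level delegation choice, does strictly better as a team, which is the gap the abstract highlights. Real instances of the problem make this search hard rather than a four-policy enumeration.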
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
- North America > United States > California > Santa Clara County > Stanford (0.06)
- North America > United States > New York > Tompkins County > Ithaca (0.04)
- (7 more...)
- Research Report (0.64)
- Workflow (0.45)
JPEG Compliant Compression for Both Human and Machine, A Report
Deep Neural Networks (DNNs) have become an integral part of our daily lives, especially in vision-related applications. However, conventional lossy image compression algorithms are designed primarily for the Human Vision System (HVS) and can non-trivially compromise DNNs' validation accuracy after compression, as noted in \cite{liu2018deepn}. Developing an image compression algorithm that serves both human and machine (DNN) consumers is therefore on the horizon. To address this challenge, in this paper we first formulate image compression as a multi-objective optimization problem that takes both human and machine perspectives into account, solve it by linear combination, and propose a novel distortion measure for both human and machine, dubbed Human and Machine-Oriented Error (HMOE). Based on HMOE, we then develop Human and Machine Oriented Soft Decision Quantization (HMOSDQ), a lossy image compression algorithm for both human and machine (DNNs) that is fully compliant with the JPEG format. Finally, to evaluate the performance of HMOSDQ, we conduct experiments with two well-known pre-trained DNN-based image classifiers, AlexNet \cite{Alexnet} and VGG-16 \cite{simonyan2014VGG}, on two subsets of the ImageNet \cite{deng2009imagenet} validation set: one contains images whose shorter side is in the range of 496 to 512 pixels, the other images whose shorter side is in the range of 376 to 384 pixels. Our results demonstrate that HMOSDQ outperforms the default JPEG algorithm in terms of both rate-accuracy and rate-distortion performance. For AlexNet, compared with the default JPEG algorithm, HMOSDQ can improve validation accuracy by more than $0.81\%$ at $0.61$ BPP, or equivalently reduce the compression rate of default JPEG by $9.6\times$ while maintaining the same validation accuracy.
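The "linear combination" step the abstract describes can be sketched in a few lines. This is a minimal stand-in, not the paper's exact HMOE definition: the human-oriented term is plain MSE (a proxy for an HVS-based measure), the machine-oriented term is the increase in a classifier's loss caused by the reconstruction, and `alpha` and `classifier_loss` are illustrative names I have introduced.

```python
def hmoe_like_distortion(original, reconstructed, classifier_loss, alpha=0.5):
    """Weighted sum of a human-oriented and a machine-oriented distortion.

    `original` and `reconstructed` are flat sequences of pixel values;
    `classifier_loss` maps an image to the DNN's loss on it (assumed here).
    """
    n = len(original)
    # Human-oriented term: mean squared error as an HVS stand-in.
    human_term = sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / n
    # Machine-oriented term: how much the reconstruction hurts the classifier.
    machine_term = classifier_loss(reconstructed) - classifier_loss(original)
    return alpha * human_term + (1 - alpha) * machine_term

# Toy usage with `sum` standing in for a real classifier's loss.
score = hmoe_like_distortion([0.0] * 4, [1.0] * 4, sum)
```

A soft-decision quantizer can then choose JPEG-compliant coefficients that minimize a rate-distortion Lagrangian built on such a combined measure instead of MSE alone.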
- North America > United States (0.14)
- North America > Canada (0.14)
Why I'm deeply sceptical about comparisons between humans and machines
Artificial intelligence has humans beat – at least when it comes to games like chess and Go, identifying the 3D structure of proteins, generating investment strategies…the list goes on and on. Some argue that models like ChatGPT are already at the threshold of human intelligence. OpenAI head Sam Altman even threw his unborn child under the bus, claiming "my kid is never gonna grow up being smarter than AI". The capabilities of modern AI are certainly impressive, but I am deeply sceptical about comparisons between humans and machines.
- Information Technology > Artificial Intelligence > Cognitive Science (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.73)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.73)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)
Tracking Without Re-recognition in Humans and Machines
Imagine trying to track one particular fruitfly in a swarm of hundreds. Higher biological visual systems have evolved to track moving objects by relying on both their appearance and their motion trajectories. We investigate if state-of-the-art spatiotemporal deep neural networks are capable of the same. For this, we introduce PathTracker, a synthetic visual challenge that asks human observers and machines to track a target object in the midst of identical-looking "distractor" objects. While humans effortlessly learn PathTracker and generalize to systematic variations in task design, deep networks struggle.
Perception of Visual Content: Differences Between Humans and Foundation Models
Pratama, Nardiena A., Fan, Shaoyang, Demartini, Gianluca
Human-annotated content is often used to train machine learning (ML) models. Recently, however, language and multi-modal foundation models have been used to replace and scale up human annotators' efforts. This study compares human-generated and ML-generated annotations of images representing diverse socio-economic contexts. We aim to understand differences in perception and identify potential biases in content interpretation. Our dataset comprises images of people from various geographical regions and income levels washing their hands. We compare human and ML-generated annotations semantically and evaluate their impact on predictive models. Our results show low similarity between human and machine annotations from a low-level perspective, i.e., in the types of words that appear and in sentence structure, but the two are alike in how similar or dissimilar they perceive images across different regions to be. Additionally, human annotations yielded the best overall and most balanced region classification performance at the class level, while ML Objects and ML Captions performed best for income regression. The similarity between humans and machines in their lack of bias when perceiving images highlights that they are more alike than initially perceived. The superior and fairer performance of human annotations for region classification and of machine annotations for income regression shows how important the quality of the images and the discriminative features in the annotations are.
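One simple way to quantify the "low-level" similarity the abstract mentions (word choice rather than meaning) is token-set overlap. The sketch below uses Jaccard similarity on two invented annotation strings; the paper's actual pipeline is richer, involving semantic comparison and downstream predictive models.

```python
def jaccard(a, b):
    """Jaccard similarity between the word sets of two annotation strings."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb)

# Invented example annotations for one image (not from the paper's dataset).
human_ann = "a woman washing her hands at an outdoor tap"
machine_ann = "person washing hands with water from a faucet"
score = jaccard(human_ann, machine_ann)  # low overlap despite shared meaning
```

The pair above scores low even though both captions describe the same scene, which is exactly the gap between surface-level and semantic agreement that motivates comparing annotations at both levels.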
- Asia (0.05)
- Europe (0.05)
- North America > United States > New York > New York County > New York City (0.04)
- (2 more...)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.46)
Aligning Generalisation Between Humans and Machines
Ilievski, Filip, Hammer, Barbara, van Harmelen, Frank, Paassen, Benjamin, Saralajew, Sascha, Schmid, Ute, Biehl, Michael, Bolognesi, Marianna, Dong, Xin Luna, Gashteovski, Kiril, Hitzler, Pascal, Marra, Giuseppe, Minervini, Pasquale, Mundt, Martin, Ngomo, Axel-Cyrille Ngonga, Oltramari, Alessandro, Pasi, Gabriella, Saribatur, Zeynep G., Serafini, Luciano, Shawe-Taylor, John, Shwartz, Vered, Skitalinskaya, Gabriella, Stachl, Clemens, van de Ven, Gido M., Villmann, Thomas
Recent advances in AI -- including generative approaches -- have resulted in technology that can support humans in scientific discovery and decision support but may also disrupt democracies and target individuals. The responsible use of AI increasingly shows the need for human-AI teaming, necessitating effective interaction between humans and machines. A crucial yet often overlooked aspect of these interactions is the different ways in which humans and machines generalise. In cognitive science, human generalisation commonly involves abstraction and concept learning. In contrast, AI generalisation encompasses out-of-domain generalisation in machine learning, rule-based reasoning in symbolic AI, and abstraction in neuro-symbolic AI. In this perspective paper, we combine insights from AI and cognitive science to identify key commonalities and differences across three dimensions: notions of generalisation, methods for generalisation, and evaluation of generalisation. We map the different conceptualisations of generalisation in AI and cognitive science along these three dimensions and consider their role in human-AI teaming. This results in interdisciplinary challenges across AI and cognitive science that must be tackled to provide a foundation for effective and cognitively supported alignment in human-AI teaming scenarios.
Beyond Human-Only: Evaluating Human-Machine Collaboration for Collecting High-Quality Translation Data
Liu, Zhongtao, Riley, Parker, Deutsch, Daniel, Lui, Alison, Niu, Mengmeng, Shah, Apu, Freitag, Markus
Collecting high-quality translations is crucial for the development and evaluation of machine translation systems. However, traditional human-only approaches are costly and slow. This study presents a comprehensive investigation of 11 approaches for acquiring translation data, including human-only, machine-only, and hybrid approaches. Our findings demonstrate that human-machine collaboration can match or even exceed the quality of human-only translations, while being more cost-efficient. Error analysis reveals the complementary strengths of human and machine contributions, highlighting the effectiveness of collaborative methods. Cost analysis further demonstrates the economic benefits of human-machine collaboration, with some approaches achieving top-tier quality at around 60% of the cost of traditional methods. We release a publicly available dataset containing nearly 18,000 segments of varying translation quality with corresponding human ratings to facilitate future research.
- Asia > Singapore (0.05)
- North America > United States > New York > New York County > New York City (0.04)
- North America > Dominican Republic (0.04)
- (4 more...)
SNAP: Self-Supervised Neural Maps for Visual Positioning and Semantic Understanding
Semantic 2D maps are commonly used by humans and machines for navigation purposes, whether it's walking or driving. However, these maps have limitations: they lack detail, often contain inaccuracies, and are difficult to create and maintain, especially in an automated fashion. Can we use raw imagery to automatically create better maps that can be easily interpreted by both humans and machines? We introduce SNAP, a deep network that learns rich 2D neural maps from ground-level and overhead images. We train our model to align neural maps estimated from different inputs, supervised only with camera poses over tens of millions of StreetView images.